20 research outputs found
Music Information Retrieval Meets Music Education
This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration into learning software. A general overview of systems that are either commercially available or in the research stage is presented. Furthermore, three well-known MIR methods used in music learning systems and their state of the art are described: music transcription, solo and accompaniment track creation, and generation of performance instructions. As a representative example of a music learning system developed within the MIR community, the Songs2See software is outlined. Finally, challenges and directions for future research are described.
Improving semi-supervised learning for audio classification with FixMatch
Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they strongly rely on the augmentation of unannotated data. This is largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
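The core FixMatch mechanism referred to in the abstract, where confident predictions on a weakly augmented view serve as pseudo-labels for a strongly augmented view, can be sketched as follows. This is a minimal illustration in plain Python, not the evaluated implementation; the function and argument names are hypothetical.

```python
import math

def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Sketch of FixMatch's unlabeled loss: the model's prediction on a
    weakly augmented view becomes a pseudo-label when it is confident
    enough, and the strongly augmented view is trained (cross-entropy)
    to match that pseudo-label.

    weak_probs, strong_probs: lists of per-class probability rows,
    one pair of rows per unlabeled example.
    """
    total, used = 0.0, 0
    for weak, strong in zip(weak_probs, strong_probs):
        confidence = max(weak)
        pseudo_label = weak.index(confidence)
        if confidence >= threshold:            # keep only confident pseudo-labels
            total += -math.log(strong[pseudo_label] + 1e-12)
            used += 1
    return total / used if used else 0.0       # low-confidence examples are ignored
```

With a high threshold only confident predictions contribute, which is why the choice of augmentations matters: a strong augmentation must not change the label implied by the weak view.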
Machine Learning for Intelligent Production: talk given at the Technologietag, 9 October 2018, Erfurt
Machine learning scientist Sascha Grollmisch from Fraunhofer IDMT explained to the audience the potential that machine learning holds for production processes. "The use of machine learning methods is a powerful tool when it comes to making a production line intelligent," said Grollmisch. Machine learning systems have so far been applied very successfully at Fraunhofer IDMT in video analysis, speech recognition, and music analysis. The goal of the institute's development work is to create a system that learns autonomously from acoustic measurement data to assess the quality of production processes or products. Grollmisch emphasized that there is no universal recognition solution for all application scenarios. Rather, the meaningful use of suitable machine learning methods requires an individual understanding of the problem as well as a tailored system design. The consensus of the subsequent discussion was that customers' trust in the reliability of machine learning methods and their results still needs to be established.
Automatic Chord Recognition in Music Education Applications
In this work, we demonstrate the market-readiness of a recently published state-of-the-art chord recognition method, where automatic chord recognition is extended beyond major and minor chords to the extraction of seventh chords. To do so, the proposed chord recognition method was integrated into the Songs2See Editor, which already includes the automatic extraction of the main melody, bass line, beat grid, key, and chords for any musical recording.
Audible Defects: Monitoring Machines and Products Using Acoustic Signals
The human ear is remarkably capable. Often, the sound alone reveals whether a component is working or faulty. Scientists at Fraunhofer IDMT are trying to teach machines this kind of "hearing" in order to identify defective products during production, thereby contributing to automated quality assurance.
Acoustic Quality Control with Artificial Intelligence
At the latest with the suffix "4.0", industry has begun the process of digitalization in recent years and is now implementing the progressive automation of production processes. The focus is on networking plants and components: sensors increasingly take over the function of the human senses and thereby offer diverse potential for optimizing the value chain. Developments in sensor and measurement technology, combined with advances in machine learning methods, form the basis for innovative approaches to process automation. In this context, the Fraunhofer Institute for Digital Media Technology IDMT in Ilmenau develops acoustic methods for assuring the quality of processes and products, which can be adapted to a variety of applications in industrial production. This article presents the individual components required for an integrated system solution for airborne-sound analysis, using a concrete application case.
IDMT-ISA-Compressed-Air Dataset
The IDMT-ISA-Compressed-Air (IICA) dataset aims to foster research in compressed air leak detection from acoustic emissions in the audible frequency range, with recordings of air leaks in a simulated industrial compressed air network. The dataset contains recordings of multiple leak types with different types of industrial background noise played via external loudspeakers at two different volumes during the recording process.
Leak Types:
Vent Leak
Vent Leak Low Pressure
Tube Leak
Noise Types:
Lab Noise (no added background noise)
Hydraulic machine noise
Hydraulic machine noise, low volume
General factory workshop noise
General factory workshop noise, low volume
For each combination of leak and noise types, there were three recording sessions. During each session, four Earthworks M30 omnidirectional measurement microphones placed in different configurations recorded the acoustic emission of the compressed air network. Each recording session contains 128 files of 30 seconds each, corresponding to each combination of leak, noise and microphone.
Total Files: 5592
Sampling Rate: 48 kHz
Resolution: 32-bit
Mono Audio
See the referenced paper and the README contained in the data folder for further details.
IDMT-SMT-Chords Dataset
The IDMT-SMT-Chords dataset comprises 16 MIDI-generated audio files covering a wide range of chord classes. We focused on chord voicings that are commonly used on keyboard instruments and guitars, and accordingly categorized the material into Guitar and Non-Guitar instruments. We used several software instruments from Ableton Live and GarageBand to synthesize the MIDI files with various instruments such as piano, synthesizer pad, as well as acoustic and electric guitar.
Total duration: 4.1 hours
# Chord segments: 7398
# WAV files: 16
Chord duration: 2 seconds
BPM: 120
Time signature: 4/4
Sampling rate: 44.1 kHz
Mono audio
Non-Guitar
The Non-Guitar files include all chord types in all possible root-note positions and inversions. For example, the C major triad is included together with its two inversions, C/E and C/G.
All non-guitar chord classes are listed below:
Major (+ 2 inversions)
Minor (+ 2 inversions)
Major 7 (+ 3 inversions)
Minor 7 (+ 3 inversions)
Power Chord - root and fifth note (+ 1 inversion)
Dominant 7 (+ 3 inversions)
Minor 7 flat 5 (+ 3 inversions)
This gives us 576 non-guitar chord classes.
Guitar
The guitar files were generated based on barre chord voicings with the root note located on the low E, A, and D strings. For example, to model the major chord and its voicings, we take the open-position E major, A major, and D major shapes and move each of them up 12 frets (including the octave at the 12th fret), which gives 39 positions (13 × 3).
List of Guitar chord types:
Major (+ 2 voicings)
Minor (+ 2 voicings)
Major 7 (+ 2 voicings)
Minor 7 (+ 2 voicings)
Power Chord - root and fifth note (+ 2 voicings)
Dominant 7 (+ 2 voicings)
Minor 7 flat 5 (+ 2 voicings)
This gives us 273 different guitar chord classes.
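The guitar class count follows directly from the construction described above: 7 chord types, each with 3 barre voicings (root on the low E, A, and D strings), each movable across 13 fret positions. A quick check in Python:

```python
# Chord-class count for the guitar subset of IDMT-SMT-Chords:
# 7 chord types x 3 barre voicings x 13 fret positions (frets 0..12,
# including the octave at the 12th fret).
chord_types = ["maj", "min", "maj7", "min7", "power", "dom7", "m7b5"]
voicings_per_type = 3        # E-, A-, and D-shape barre voicings
positions_per_voicing = 13   # open position plus 12 frets

positions_per_type = voicings_per_type * positions_per_voicing  # 39
total_classes = len(chord_types) * positions_per_type
print(positions_per_type, total_classes)  # prints: 39 273
```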